A New Baseline Estimation Method Applied to Arabic Word Recognition

Authors

  • Fouad Slimane
  • Slim Kanoun
  • Rolf Ingold
  • Adel M. Alimi
Abstract

In this paper we analyse the impact of different baseline identification approaches on single-word recognition. We show that classical baseline identification approaches based on horizontal projection histograms may fail to accurately detect the baseline of short words, which affects the whole processing chain and induces errors. Starting from this observation, we propose a novel approach based on stochastic models that proposes probable baseline regions from character features. Once the most probable baseline region is detected, we fine-tune the position of the baseline with a horizontal projection histogram. We ran our experiments on a printed word recognition task using the APTI database and observed a significant increase in performance.

Keywords: HMM; GMM; Arabic recognition; baseline
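The horizontal projection step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the stochastic (HMM/GMM) stage that proposes a baseline region is not reproduced here, and the `region` argument is a hypothetical stand-in for its output. The function names and the toy image are assumptions made for illustration only.

```python
import numpy as np


def horizontal_projection(binary_img):
    """Count ink pixels in each row of a binary word image (1 = ink)."""
    return np.asarray(binary_img).sum(axis=1)


def refine_baseline(binary_img, region=None):
    """Return the row with the highest ink count inside `region`.

    `region` is a hypothetical (top, bottom) row interval, bottom
    exclusive, standing in for a candidate baseline region supplied by
    an earlier stage. With region=None the whole image is searched,
    which corresponds to the classical projection-only estimate.
    """
    profile = horizontal_projection(binary_img)
    top, bottom = (0, len(profile)) if region is None else region
    return top + int(np.argmax(profile[top:bottom]))


# Toy 6x8 "word" image, invented for this example.
word = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
])

global_row = refine_baseline(word)            # search every row
local_row = refine_baseline(word, (4, 6))     # search rows 4 and 5 only
```

Restricting the search to a candidate region changes which row is selected, which is the point of locating a probable region first: on short words the global maximum of the projection is not necessarily the true baseline.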


Similar resources

Component-based Segmentation of Words from Handwritten Arabic Text

Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. First, connected components (CCs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words se...


End-Shape Analysis for Automatic Segmentation of Arabic Handwritten Texts

Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several approaches to word segmentation have been proposed for Latin-based languages while a few of them have been introduced for...


Performance of hidden Markov model and dynamic Bayesian network classifiers on handwritten Arabic word recognition

This paper presents a comparative study of two machine learning techniques for recognizing handwritten Arabic words, in which hidden Markov models (HMMs) and dynamic Bayesian networks (DBNs) were evaluated. The proposed work is divided into three stages, namely preprocessing, feature extraction and classification. Preprocessing includes baseline estimation and normalization as well as segmentation...


Rescoring N-Best Hypotheses for Arabic Speech Recognition: A Syntax-Mining Approach

Improving speech recognition accuracy through linguistic knowledge is a major research area in automatic speech recognition systems. In this paper, we present a syntax-mining approach to rescore N-Best hypotheses for Arabic speech recognition systems. The method depends on a machine learning tool (WEKA-3-6-5) to extract the N-Best syntactic rules of the Baseline tagged transcription corpus whic...


Efficient System for Speech Recognition using General Regression Neural Network

In this paper we present an efficient system for speaker-independent speech recognition based on a neural network approach. The proposed architecture comprises two phases: a preprocessing phase, which consists of segmental normalization and feature extraction, and a classification phase, which uses neural networks based on nonparametric density estimation namely the general regression neural networ...



Publication date: 2012